CPRel: Semantic Relatedness Computation Using Wikipedia based Context Profiles

Authors

  • Shahida Jabeen
  • Xiaoying Gao
  • Peter Andreae
Abstract

Semantic relatedness is a well-known problem whose significance ranges from computational linguistics to natural language processing applications. Relatedness computation is constrained by the amount of common-sense and background knowledge required to relate any two terms. This paper proposes a novel model of relatedness using context profiles built on features extracted from encyclopedic knowledge. The proposed approach uses Wikipedia to represent the context of a word in the high-dimensional space of Wikipedia labels. The semantic relatedness of a word pair is then assessed by comparing the corresponding context profiles under three different weighting schemes using the traditional cosine similarity metric. To evaluate the proposed approach, three well-known benchmark datasets are used, and it is shown that Wikipedia article contents can be used effectively to compute term relatedness. The experiments demonstrate that the proposed approach is computationally cheap as well as effective when correlated with human judgments.
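
The comparison step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the three weighting schemes, so the sketch assumes raw label counts as weights, and the function names and the toy label lists are hypothetical.

```python
import math
from collections import Counter


def build_profile(labels):
    """Build a sparse context profile mapping label -> weight.

    Assumption: the weight is the raw count of each label. The paper's
    actual weighting schemes are not given in the abstract.
    """
    return Counter(labels)


def cosine_similarity(profile_a, profile_b):
    """Cosine similarity between two sparse profiles (dicts of label -> weight)."""
    # Only labels present in both profiles contribute to the dot product.
    shared = set(profile_a) & set(profile_b)
    dot = sum(profile_a[label] * profile_b[label] for label in shared)
    norm_a = math.sqrt(sum(w * w for w in profile_a.values()))
    norm_b = math.sqrt(sum(w * w for w in profile_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty profile is treated as unrelated to everything
    return dot / (norm_a * norm_b)


# Hypothetical Wikipedia labels standing in for each word's extracted context.
car = build_profile(["Vehicle", "Engine", "Road", "Engine"])
automobile = build_profile(["Vehicle", "Engine", "Wheel"])
banana = build_profile(["Fruit", "Plant"])

print(cosine_similarity(car, automobile))  # overlapping labels, score between 0 and 1
print(cosine_similarity(car, banana))      # no shared labels, score 0.0
```

Storing each profile as a sparse mapping keeps the comparison cheap, since only the labels shared by both words enter the dot product even though the full label space is high-dimensional.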

Similar resources

Measuring of Semantic Relatedness between Words based on Wikipedia Links

A novel technique for measuring semantic relatedness between words based on the link structure of Wikipedia is presented. The method uses only Wikipedia’s link information, which spares researchers burdensome text processing. During relatedness computation, the positive effects of two-directional Wikipedia links and four link types are taken into account. Using a widel...

Extracting Semantic Information from Wikipedia Using Human Computation and Dimensionality Reduction

Semantic background knowledge is crucial for many intelligent applications. A classical way to represent such knowledge is through semantic networks. Wikipedia’s hyperlink graph can be considered a primitive semantic network, since the links it contains usually correspond to semantic relationships between the articles they connect. However, Wikipedia is rather noisy in this function. We propose...

Real Time Filtering of Tweets Using Wikipedia Concepts and Google Tri-gram Semantic Relatedness

This paper describes our participation in the mobile notification and email digest tasks in the TREC 2015 Microblog track. The tasks are about monitoring the Twitter stream and retrieving tweets relevant to users’ interest profiles. Interest profiles contain the description of a topic about which the user is interested in receiving relevant posts in real time. Our proposed approach extracts Wikipedia conc...

Using a Wikipedia-based Semantic Relatedness Measure for Document Clustering

A graph-based distance between Wikipedia articles is defined using a random walk model, which estimates visiting probability (VP) between articles using two types of links: hyperlinks and lexical similarity relations. The VP to and from a set of articles is then computed, and approximations are proposed to make tractable the computation of semantic relatedness between every two texts in a large...

Advertising Keyword Suggestion Using Relevance-Based Language Models from Wikipedia Rich Articles

When emerging technologies such as Search Engine Marketing (SEM) face tasks that require human level intelligence, it is inevitable to use the knowledge repositories to endow the machine with the breadth of knowledge available to humans. Keyword suggestion for search engine advertising is an important problem for sponsored search and SEM that requires a goldmine repository of knowledge. A recen...

Journal:
  • Research in Computing Science

Volume 70

Publication date: 2013